Amendments to Claims
Please amend the claims as set forth hereafter. 

Claims 1-34 Cancelled 

35. (Currently amended) A method of discovering one or more patterns in a set of k 
sequences of symbols, called a k-tuple, where k is greater than or equal to two, within an 
overall set of w sequences having sequence numbers 1, 2, ..., w, the symbols being members
of an alphabet, each sequence of symbols having respective lengths L1, L2, ..., Lw, comprising
the steps of: 

a) translating the sequences of symbols into a table of ordered (symbol, position index) 
pairs, where the position index refers to the location of the symbol in a sequence; 

b) for each of the w sequences, grouping the (symbol, position index) pairs by symbol 
to form a respective master offset table, thus creating w master offset tables; 

c) using the position indices in the w master offset tables to determine the difference- 
in-position value between each occurrence of a symbol in one of the sequences and each 
occurrence of that same symbol in the other sequence in each master offset table, forming a
k-tuple table associated with the k-tuple, the table comprising k columns, one of the k 
columns being a primary column and the remaining (k-1) columns being suffix columns, each 
column corresponding to one of the k sequences; 

i) the primary column comprising the (symbol, position index) pairs of 
a primary sequence, 

ii) the (k-1) suffix columns comprising (symbol, difference-in-position 
value) pairs, where the difference-in-position values are the position 
differences between all same symbols of each remaining sequence of the tuple 
and the primary sequence of the tuple, 

iii) the rows in the k-tuple table resulting from forming all 
combinations of same symbols from each sequence; 

d) creating a sorted k-tuple table by performing a multi-key sort on the k-tuple table,
the sort keys being selected respectively from the difference-in-position values of the last
suffix column (k-th column) through the difference-in-position value of the first suffix column;

e) identifying one or more patterns by collecting adjacent rows of the sorted
k-tuple table whose suffix columns contain identical difference-in-position values, the relative
positions of the symbols in each pattern being determined by the primary column position
indices, the one or more patterns being common to the k sequences; and

f) reading out the identified one or more patterns to a user.
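
As a non-limiting illustration, steps a) through e) of claim 35 might be sketched in Python as follows. The function names, the zero-based position indices, the row layout (symbol, primary position, tuple of differences), the use of the primary position as a final tie-breaking sort key, and the `min_symbols` filter are all assumptions made for this sketch and are not recited in the claim.

```python
from collections import defaultdict
from itertools import groupby, product


def master_offset_table(seq):
    """Steps a) and b): group the position indices of one sequence by symbol."""
    table = defaultdict(list)
    for pos, sym in enumerate(seq):
        table[sym].append(pos)
    return table


def k_tuple_table(seqs):
    """Step c): one row per combination of same symbols across the k sequences.

    The first sequence is taken as the primary sequence (an assumption).
    Each row is (symbol, primary position, (diff to seq 2, ..., diff to seq k)).
    """
    tables = [master_offset_table(s) for s in seqs]
    primary, others = tables[0], tables[1:]
    rows = []
    for sym, primary_positions in primary.items():
        if any(sym not in t for t in others):
            continue  # symbol is absent from at least one sequence
        for p in primary_positions:
            for combo in product(*(t[sym] for t in others)):
                rows.append((sym, p, tuple(q - p for q in combo)))
    return rows


def find_patterns(seqs, min_symbols=2):
    rows = k_tuple_table(seqs)
    # Step d): multi-key sort, last suffix column first, first suffix column last.
    rows.sort(key=lambda r: (r[2][::-1], r[1]))
    # Step e): adjacent rows with identical suffix differences form one pattern.
    patterns = []
    for diffs, group in groupby(rows, key=lambda r: r[2]):
        members = [(p, sym) for sym, p, _ in group]
        if len(members) >= min_symbols:
            patterns.append((diffs, members))
    return patterns
```

For example, `find_patterns(["ABCDE", "XXABCXE", "ABCYE"])` yields a single pattern whose symbols A, B, C and E sit at primary positions 0, 1, 2 and 4, with suffix differences (2, 0).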

36. (Currently amended) The method of claim 35 further comprising: 
g) deleting all patterns not satisfying a predetermined criteria. 

37. (Currently amended) The method of claim 35 further comprising: 

g) deleting all patterns shorter than a first predetermined span and longer than a
second predetermined span. 

38. (Currently amended) The method of claim 35 further comprising: 

g) deleting all patterns having fewer than a predetermined number of symbols.

39. (Original) The method of claim 35, further comprising the step of deleting rows 
from the k-tuple table which do not have suffix indices identical to any other row of the k- 
tuple table. 

40. (Original) The method of claim 35 further comprising the step of deleting rows 
from the k-tuple table according to predetermined criteria. 

41. (Previously presented) The method of claim 40, wherein rows sharing identical
suffix column difference-in-position values are deleted from the k-tuple table if there are
fewer than Ns such rows, where Ns is the minimum number of symbols per pattern.

42. (Currently amended) A method of discovering one or more patterns in a set of 
k+1 sequences of symbols, called a (k+1)-tuple, where k is greater than or equal to two,
within an overall set of w sequences having sequence numbers 1, 2, ..., w, the symbols being
members of an alphabet, each sequence of symbols having respective lengths L1, L2, ..., Lw,
by first forming a k-tuple table and then forming a (k+1)-tuple table by combining the k-tuple
table with an additional sequence of symbols, the formation of the k-tuple table comprising 
the steps of: 

a) translating the sequences of symbols into a table of ordered (symbol, position index)
pairs, where the position index refers to the location of the symbol in a sequence; 

b) for each of the w sequences, grouping the (symbol, position index) pairs by symbol 
to form a respective master offset table, thus creating w master offset tables; 

c) using the position indices in the w master offset tables to determine the difference- 
in-position value between each occurrence of a symbol in one of the sequences and each 
occurrence of that same symbol in the other sequence in each master offset table, forming a
k-tuple table associated with the k-tuple, the table comprising k columns, one of the k 
columns being a primary column and the remaining (k-1) columns being suffix columns, each 
column corresponding to one of the k sequences; 



i) the primary column comprising the (symbol, position index) pairs of 
a primary sequence, 

ii) the (k-1) suffix columns comprising (symbol, difference-in-position 
value) pairs, where the difference-in-position values are the position 
differences between all same symbols of each remaining sequence of the tuple 
and the primary sequence of the tuple, 

iii) the rows in the k-tuple table resulting from forming all 
combinations of same symbols from each sequence; 



d) creating a sorted k-tuple table by performing a multi-key sort on the k-tuple table, 
the sort keys being selected respectively from the difference-in-position values of the last 
suffix column (k-th column) through the difference-in-position value of the first suffix column;

the formation of the (k+1)-tuple table comprising the steps of:

e) translating the additional sequence of symbols into a table of ordered (symbol, 
position index) pairs, where the position index refers to the location of the symbol in the 
additional sequence of symbols; 

f) grouping the (symbol, position index) pairs by symbol to form a master offset table; 

g) creating the (k+1)-tuple table of k+1 columns, one of the k+1 columns being a
primary column and the remaining k columns being suffix columns, by: 



i) forming all combinations of same symbols between the primary 
column of the k-tuple table and the master offset table, 

ii) for each such combination, duplicating the corresponding row of the k- 
tuple table, and appending a (symbol, difference-in-position value) pair 
corresponding to the difference between the position index of the master offset 
table and the position index of the primary column; 

h) creating a sorted (k+1)-tuple table by performing a multi-key sort on the (k+1)-tuple
table, the sort keys being selected respectively from the difference-in-position values of the
last suffix column [(k+1)-th column] through the difference-in-position value of the first suffix
column; and

i) identifying one or more patterns by collecting adjacent rows of the sorted
(k+1)-tuple table whose suffix columns contain identical difference-in-position values, the
relative positions of the symbols in each pattern being determined by the primary column
position indices, the one or more patterns being common to the k+1 sequences; and

j) reading out the identified one or more patterns to a user.
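
As a non-limiting illustration, steps e) through g) of claim 42 (extending an existing k-tuple table with one additional sequence) might be sketched in Python as follows. The function name `extend_table`, the zero-based position indices, and the row layout (symbol, primary position, tuple of differences) are assumptions made for this sketch.

```python
from collections import defaultdict


def extend_table(k_rows, new_seq):
    """Build (k+1)-tuple rows from k-tuple rows and one additional sequence.

    k_rows: list of (symbol, primary position, tuple of k-1 differences).
    Returns rows carrying one more difference-in-position value each.
    """
    # Steps e) and f): master offset table of the additional sequence.
    offsets = defaultdict(list)
    for pos, sym in enumerate(new_seq):
        offsets[sym].append(pos)

    # Step g): for every same-symbol combination, duplicate the k-tuple row
    # and append the difference between the new position and the primary position.
    extended = []
    for sym, p, diffs in k_rows:
        for q in offsets.get(sym, ()):
            extended.append((sym, p, diffs + (q - p,)))
    return extended
```

A row whose symbol does not occur in the additional sequence produces no (k+1)-tuple row, and a row whose symbol occurs several times is duplicated once per occurrence.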

43. (Currently amended) The method of claim 42 further comprising the step of: 

k) deleting patterns from a k-tuple table common to the k-tuple table and a (k+1)-
tuple table, where the (k+1)-tuple table contains all of the sequences of the k-tuple table with
one additional sequence, by: 

i) deleting the suffix column corresponding to a sequence not shared 
between the two tuple tables, thereby defining a modified table, and 

ii) deleting all rows from the k-tuple table whose suffix columns all 
contain identical sets of difference-in-position values to a row of the modified 
table. 
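
As a non-limiting illustration, the deletion recited in claim 43 might be sketched in Python as follows. The function name, the row layout (symbol, primary position, tuple of differences), and the `dropped_column` argument (the zero-based index, within the suffix differences, of the sequence not shared by the two tables) are assumptions made for this sketch.

```python
def remove_rows_supported_by_larger_tuple(k_rows, k1_rows, dropped_column):
    """Delete k-tuple rows that reappear in the (k+1)-tuple table.

    k_rows:  rows of the k-tuple table, each (symbol, position, k-1 differences).
    k1_rows: rows of the (k+1)-tuple table, each (symbol, position, k differences).
    """
    # Step i): drop the suffix column of the sequence not shared by both tables,
    # giving the suffix differences of the "modified table".
    modified_diffs = {
        diffs[:dropped_column] + diffs[dropped_column + 1:]
        for _, _, diffs in k1_rows
    }
    # Step ii): keep only k-tuple rows whose suffix differences do not match
    # any row of the modified table.
    return [row for row in k_rows if row[2] not in modified_diffs]
```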

44. (Currently amended) A method of discovering one or more patterns in a set of k 
sequences of symbols, called a k-tuple, comprising the steps of: 

a) for a first pair of sequences of the k-tuple 

i) translating each sequence of symbols into a table of ordered (symbol, 
position index) pairs, where the position index of each (symbol, position index) 
pair refers to the location of the symbol in the sequence; 

ii) for each of the paired sequences, grouping the (symbol, position 
index) pairs by symbol to respectively form a first master offset table and a 
second master offset table; 

iii) forming a Pattern Map comprising an array having (L1 + L2 - 1)
rows by: 

A) subtracting the position index of the first master 
offset table from the position index of the second master offset 
table for every combination of (symbol, position index) pair 
having same symbols, the difference-in-position value resulting
from each subtraction defining a row index; 



B) storing each (symbol, position index) pair from the 
first master offset table in a row of the Pattern Map, the row 
being defined by the row index, until all (symbol, position 
index) pairs have been stored in the Pattern Map; 



iv) identifying a parent pattern by collecting symbols having
the identical difference-in-position value from each row of the Pattern Map and 
populating an output array with the collected symbols, the symbols being 
placed at relative locations in the parent pattern indicated by the position index 
of the (symbol, position index) pair; and 

v) repeating step iv) for each row of the Pattern Map; 

b) storing the identified patterns as arrays of (symbol, position index) pairs;

c) for each subsequent pair of sequences of the k-tuple, replacing the (symbol, position 
index) pairs of the first sequence of the pair of sequences by the (symbol, position index) pairs 
of the stored patterns; and 

d) repeating steps (a) through (c) for each subsequent pair of sequences until the k-th 
sequence of the k-tuple is reached; and

e) reading out the identified one or more patterns to a user.
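
As a non-limiting illustration, step a) of claim 44 (the Pattern Map for one pair of sequences) might be sketched in Python as follows. The function names, the zero-based position indices, the shift of L1 - 1 used to turn a possibly negative difference into a row index, the "." placeholder for unmatched positions, and the `min_symbols` filter are assumptions made for this sketch.

```python
from collections import defaultdict


def pattern_map(seq1, seq2):
    """Array of L1 + L2 - 1 rows, one per possible difference-in-position value.

    Row (d + L1 - 1) holds the (symbol, position index) pairs of seq1 whose
    symbol also occurs in seq2 at position index + d.
    """
    L1, L2 = len(seq1), len(seq2)
    rows = [[] for _ in range(L1 + L2 - 1)]
    positions2 = defaultdict(list)  # master offset table of the second sequence
    for j, sym in enumerate(seq2):
        positions2[sym].append(j)
    for i, sym in enumerate(seq1):
        for j in positions2.get(sym, ()):
            rows[(j - i) + (L1 - 1)].append((sym, i))
    return rows


def parent_patterns(seq1, seq2, min_symbols=2):
    """Read one parent pattern out of each sufficiently populated row."""
    L1 = len(seq1)
    found = {}
    for row_index, row in enumerate(pattern_map(seq1, seq2)):
        if len(row) < min_symbols:
            continue
        output = ["."] * L1  # output array; "." marks an unmatched position
        for sym, i in row:
            output[i] = sym
        found[row_index - (L1 - 1)] = "".join(output).strip(".")
    return found
```

For example, `parent_patterns("ABCDE", "XXABCXE")` returns `{2: "ABC.E"}`: the symbols A, B, C and E all recur two positions later in the second sequence.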

45. (Currently amended) The method of claim 35, further comprising the step of 
finding all patterns at all levels of support within a set of sequences by: 

g) forming a tree of nodes, where each node corresponds to each combination of k
sequences, and therefore represents a k-tuple, and wherein each node representing a k-tuple is
connected to all nodes representing (k+1)-tuples,

each (k+1)-tuple being formed by adding a unique sequence to the k-tuple, where the
sequence being added is later in the ordered list of sequences than the latest sequence of the k-
tuple;

h) traversing the tree, and at each node visited during traversal, defining one or
more patterns by collecting adjacent rows of the sorted k-tuple table whose suffix columns 
contain identical sets of difference-in-position values, the relative positions of the symbols in 
each pattern being determined by the primary column position indices, the one or more 
patterns being common to the k sequences. 
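
As a non-limiting illustration, the tree of tuples recited in claim 45 might be enumerated recursively in Python as follows. The function names, the choice of all pairs of sequences as the top-level nodes, and the depth-first visiting order are assumptions made for this sketch; the per-node pattern discovery is represented only by recording the visited node.

```python
def traverse(node, w, visit):
    """Visit a k-tuple node, then each child (k+1)-tuple.

    node: tuple of sequence numbers in increasing order.
    A child is formed by appending a sequence number later than the
    latest sequence number already in the node.
    """
    visit(node)
    for nxt in range(node[-1] + 1, w + 1):
        traverse(node + (nxt,), w, visit)


def all_tuples(w):
    """Collect every tuple of two or more sequence numbers out of 1..w."""
    visited = []
    for first in range(1, w + 1):
        for second in range(first + 1, w + 1):
            traverse((first, second), w, visited.append)
    return visited
```

Because each child only appends a later sequence number, every combination of two or more sequences is reached exactly once.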

46. (Original) The method of claim 45, wherein the traversal of the tree of nodes is
accomplished via recursion. 

47. (Currently amended) The method of claim 45, further comprising the step of: 
i) removing duplicate patterns at each level of support.

48. (Currently amended) The method of claim 47, wherein the removal of duplicate 
patterns at each level of support in step i) is accomplished by:

I) for each node corresponding to a (k+1)-tuple, identifying the nodes
containing k-tuples whose sequences are subsets of the (k+1)-tuple; thereby defining a
set of causally-dependent nodes;

II) locating said causally-dependent nodes;

III) removing from each said causally-dependent node the patterns in
common with the (k+1)-tuple; and

IV) if the k-tuple table in a causally-dependent node is thereby reduced to
zero length, removing the corresponding node and all of its descendants from the tree
prior to their traversal.

49. (Currently amended) The method of claim 48, wherein locating causally-
dependent nodes in step II) comprises the steps of:



(A) organizing the nodes at level k in the Tuple-tree into a linked list 
which is ordered from left to right in accordance with the sequence numbers 
represented by each tuple; and 

(B) searching said linked list for nodes which are causally-dependent on 
a particular (k+1)-tuple.



50. (Currently amended) The method of claim 48, wherein the nodes located in step 
II) are causally-dependent nodes at level k determined with respect to another node at
level k, and are thus causally-dependent on a child of the another node at level k. 

51. (Currently amended) The method of claim 47, wherein the removal of duplicate
patterns at each level of support in step i) comprises the steps of:

I) organizing the nodes at level k in the Tuple-tree into a linked list which is
ordered from left to right in accordance with the sequence numbers of each tuple;

II) for each pattern in the current node at level k, storing a "hit list" array
of the sequence numbers of the sequences containing the pattern;

III) for all nodes to the right of the current node whose sequence numbers
are all in the hit list, searching for a duplicate instance of the pattern, and if found,
eliminating it; and

IV) making each node the current node, repeating steps (II) and
(III), in the order of the node's appearance in the linked list.

52. (Currently amended) The method of claim 51, wherein, in step III), the
nodes consistent with the hit list are found using a hash tree, the hash tree having a root and k 
levels of nodes, the k-th level of the hash tree having a plurality of leaf nodes, the respective 
level of nodes of the hash tree corresponding to the respective sequence numbers of a k-tuple, 
the leaf nodes identifying the k-tuple whose sequence numbers correspond to the path from 
the root to the leaf node, wherein 

searching the nodes for pattern duplicates is performed by repeating steps (II) and (III)
for each node in the order of the appearance of that node in the hash tree.

53. (Currently amended) The method of claim 45 wherein the traversing step h)
itself comprises the steps of: 

i) creating a Virtual Sequence Array of patterns found within the sequences, wherein 
the patterns are termed P-nodes and the tuple nodes are termed T-nodes, 

ii) finding a P-node list corresponding to the location of each pattern in the primary
sequence of that tree node, 

iii) searching the P-node list for a duplicate instance of the pattern, 
(A) if no duplicate is found: 



(1) adding a pointer to the pattern of the current T-node pattern array,

(2) finding all locations of the pattern within the Virtual
Sequence Array,

(3) adding a pointer to the pattern to each corresponding P-node array;

(B) if a duplicate pattern is found: 

(1) ignoring the pattern if the duplicate pattern was found at 
support equal to the current level of support, 

(2) if the duplicate pattern was found at a previous level of
support, unlinking the duplicate pattern from its previous T-node (if it 
exists), and relinking the duplicate pattern to the current T-node, 

(3) repeating steps 1) and 2) until all of the children of a T-node 
have been created, thus insuring that patterns of that T-node that are at 
their ultimate level of support are reported, and 

(4) deleting the T-node. 

Claims 54-65 Cancelled 

66. (Previously presented) A computer-readable medium containing a plurality of 
data structures useful in controlling a computer system to discover one or more patterns in k 
sequences of symbols within an overall set of w sequences, the plurality of data structures 
comprising: 

a number w of master offset table data structures each grouping, 

for each value of a difference in position between each occurrence of a 
symbol in one of the sequences and each occurrence of that same symbol in each 
other sequence, 

the position (position index) in the first sequence of each symbol therein 
that appears in each of the other sequences at that difference-in-position value; 

a k-tuple table data structure comprising columns and rows, the columns comprising 
(symbol, position index) pairs and (symbol, difference-in-position value) pairs; and 

a sorted k-tuple table data structure comprising a row-sorted representation of the 
(symbol, position index) pairs and (symbol, difference-in-position value) pairs contained in 
the k-tuple table data structure, 

wherein adjacent rows of the sorted k-tuple table data structure whose suffix columns 
contain identical difference-in-position values define one or more patterns of symbols, the 
relative positions of symbols in each pattern being determined by the primary column position 
indices in the sorted k-tuple table data structure. 

67. (Previously presented) The computer-readable medium of claim 66 wherein the 
sorted k-tuple table data structure further groups, for each value of a difference in position, the 
number of symbols in the first sequence that appear in the second sequence at that difference- 
in-position value. 

68. (Currently amended) A computer-readable medium containing instructions for
controlling a computer system to discover one or more patterns in a set of k sequences of
symbols, called a k-tuple, where k is greater than or equal to two, within an overall set of w
sequences having sequence numbers 1, 2, ..., w, the symbols being members of an alphabet,
each sequence of symbols having respective lengths L1, L2, ..., Lw, by executing a method
comprising the steps of: 

a) translating the sequences of symbols into a table of ordered (symbol, position index) 
pairs, where the position index refers to the location of the symbol in a sequence; 

b) for each of the w sequences, grouping the (symbol, position index) pairs by symbol 
to form a respective master offset table, thus creating w master offset tables; 

c) using the position indices in the w master offset tables to determine the difference- 
in-position value between each occurrence of a symbol in one of the sequences and each 
occurrence of that same symbol in the other sequence in each master offset table, forming a k- 
tuple table associated with the k-tuple, the table comprising k columns, one of the k columns 
being a primary column and the remaining (k-1) columns being suffix columns, each column 
corresponding to one of the k sequences; 

i) the primary column comprising the (symbol, position index) pairs of 
a primary sequence, 

ii) the (k-1) suffix columns comprising (symbol, difference-in-position 
value) pairs, where the difference-in-position values are the position 
differences between all same symbols of each remaining sequence of the tuple 
and the primary sequence of the tuple, 

iii) the rows in the k-tuple table resulting from forming all 
combinations of same symbols from each sequence; 

d) creating a sorted k-tuple table by performing a multi-key sort on the k-tuple table, 
the sort keys being selected respectively from the difference-in-position values of the last 
suffix column (k-th column) through the difference-in-position value of the first suffix column;
and

e) identifying one or more patterns by collecting adjacent rows of the sorted
k-tuple table whose suffix columns contain identical difference-in-position values, the relative
positions of the symbols in each pattern being determined by the primary column position
indices, the one or more patterns being common to the k sequences; and

f) reading out the identified one or more patterns to a user.






